Skip to content

Auto-respond to PR status changes (#201)#214

Open
dhilgaertner wants to merge 1 commit intomainfrom
feature/crow-201-auto-respond-pr-status
Open

Auto-respond to PR status changes (#201)#214
dhilgaertner wants to merge 1 commit intomainfrom
feature/crow-201-auto-respond-pr-status

Conversation

@dhilgaertner
Copy link
Copy Markdown
Contributor

Summary

  • Detects PR transitions into changesRequested / checksFailing by comparing each polling cycle's PRStatus against the previous one — piggybacks on the existing 60s GitHub GraphQL query, no new API calls or scopes.
  • When the user opts in (Settings → Notifications → Auto-respond), Crow types a short prompt into the session's managed Claude Code terminal asking Claude to fetch the review/logs (gh/glab) and address them. Crow doesn't fetch review bodies or CI logs itself — that keeps the implementation simple and avoids fragile log parsing.
  • Fires a macOS notification on every transition (gated by the existing notification settings) so the user is informed even with auto-respond off.
  • Dedupe key is (sessionID, kind, headSha) so re-runs on the same commit don't re-fire; auto-rearms when the rule clears (approved → changesRequested again, or new commit also fails).
  • Both opt-in toggles default off — typing into a terminal unprompted is intrusive.
  • First poll never fires (matches the existing previousReviewRequestIDs first-fetch gate) so existing PR state isn't replayed at startup.

Implementation

  • New PRStatusTransition value type + pure PRStatus.transitions(from:to:...) helper in CrowCore — keeps the comparison logic unit-testable.
  • Added headSha to PRStatus (populated from a new headRefOid field on ViewerPR / GraphQL).
  • Two new NotificationEvent cases (changesRequested, checksFailing) — they automatically inherit NotificationSettings's per-event sound/system-notification toggles.
  • New AutoRespondSettings struct on AppConfig (separate from NotificationSettings because auto-respond is an action, not a notification).
  • New AutoRespondCoordinator in the main target — finds the session's managed terminal via Terminal.isManaged and sends via TerminalManager.shared.send(id:text:), the same path crow send uses.
  • IssueTracker.applyPRStatuses rewritten to track previousPRStatus[sessionID] and emit transitions through a new onPRStatusTransitions closure (mirrors onNewReviewRequests).
  • NotificationManager.notifyPRTransition reuses the same gate logic as notifyReviewRequest.

Closes #201

Test plan

  • make build succeeds
  • make test — all 127 CrowCore tests + 31 root CrowTests + every other package suite pass
  • New PRStatus transitions test suite (14 tests) covers: first-observation never fires, single-fire on transition, no-op on repeat poll, no-fire when rule clears, simultaneous transitions, dedupe key shape (with vs without SHA)
  • Existing ViewerPR test helpers updated for new headRefOid field
  • Manual smoke (recommend running once after merge): enable Respond to failed CI checks, push a CI-breaking commit on a session-linked PR, verify notification fires within ~60s and prompt appears in the managed terminal; re-run CI on same SHA → no second prompt; push new failing SHA → second prompt fires; disable toggle → notification only

🤖 Generated with Claude Code

@dhilgaertner dhilgaertner requested a review from dgershman as a code owner April 24, 2026 22:41
Copy link
Copy Markdown
Collaborator

@dgershman dgershman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code & Security Review

Critical Issues

None found.

Security Review

Strengths:

  • Auto-respond toggles default off — good security posture for a feature that types into a terminal unsolicited.
  • Dedupe key design (sessionID|kind / sessionID|kind|headSha) prevents replay attacks from re-polling the same state.
  • The coordinator checks for both a managed terminal AND an initialized surface before dispatching — prevents sending to stale/orphaned terminals.
  • No new API scopes or tokens — piggybacks on the existing gh/glab GraphQL payload and delegates review/CI log fetching to Claude, keeping the Crow app's privilege boundary unchanged.
  • The headRefOid field is sourced from GitHub's GraphQL schema, not user input, so no injection risk from SHA values.
  • Prompt text is constructed with string interpolation of PR numbers/URLs (already validated upstream via ProviderManager.parseTicketURLComponents), not raw user input.

Concerns:

  • None significant. The prompts injected into the terminal are deterministic and constructed from trusted state (session metadata, PR metadata from GitHub API).

Code Quality

Well done:

  • Clean separation: PRStatusTransition is a pure value type in CrowCore, PRStatus.transitions(from:to:...) is a pure function — both are unit-testable without any UI/AppState dependency.
  • 14-test suite (PRStatusTransitionTests) thoroughly covers the transition matrix: first-observation gate, directional transitions, no-ops, simultaneous transitions, and dedupe key semantics.
  • AutoRespondSettings is separate from NotificationSettings — correct modeling since auto-respond is an action, not a notification.
  • Forward-compatible Codable with decodeIfPresent + defaults on AppConfig, AutoRespondSettings, and PRStatus.
  • Re-arm logic in applyPRStatuses (lines 1326–1334 of IssueTracker.swift) correctly clears emitted keys when the triggering condition clears, so future re-entries fire as expected.
  • The settingsProvider closure pattern on AutoRespondCoordinator means Settings UI toggles take effect on the next transition without explicit wiring — simple and correct.
  • NotificationSettingsView cleanly adds the auto-respond section with appropriate explanatory caption text.
  • Prompt construction in AutoRespondPrompts.build handles both GitHub and GitLab providers with appropriate CLI hints.

Minor observations (non-blocking):

  • AutoRespondPrompts.build uses transition.prNumber.map(String.init) for the inline comments hint but falls back to "<number>" as a placeholder when prNumber is nil (line 82). In practice prNumber is always populated from ViewerPR.number, so this is a safe defensive path rather than a real concern.
  • The failedCheckNames.prefix(5) truncation in the prompt (line 95) is a nice touch to avoid overwhelming Claude with a massive list.

Summary Table

Priority Issue
Green Consider adding an integration test that exercises the full IssueTracker.applyPRStatuses → onPRStatusTransitions → AutoRespondCoordinator.handle pipeline end-to-end (currently each piece is tested in isolation, which is fine)
Green The AutoRespondCoordinator could log when a transition is skipped due to the toggle being off (currently it silently continues) — helpful for debugging "why didn't auto-respond fire?"

Recommendation: Approve — this is a well-structured, defensively coded feature with good test coverage, appropriate opt-in defaults, and no security concerns. The pure/stateful split keeps the logic testable and the implementation stays within Crow's existing privilege boundary by delegating review/CI log fetching to Claude.

When a watched PR transitions into 'changes requested' or 'CI failing',
type a short prompt into the session's managed Claude Code terminal so
Claude can read the review or logs and address it without the user
having to switch sessions and dictate the fix. Both behaviors are off by
default and toggle from Settings → Notifications. macOS notifications
fire on every transition; the terminal injection only fires when opted
in. Detection piggybacks on the existing 60s GitHub poll — no new API
calls, no new scopes — and dedupes per (session, kind, head SHA) so
re-runs on the same commit don't re-fire.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dhilgaertner dhilgaertner force-pushed the feature/crow-201-auto-respond-pr-status branch from 3d34ddb to a13cd54 Compare April 25, 2026 03:36
Copy link
Copy Markdown
Collaborator

@dgershman dgershman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code & Security Review

Critical Issues

None found.

Security Review

Strengths:

  • Auto-respond is off by default — typing into a terminal unprompted is correctly treated as an intrusive action, requiring explicit opt-in per transition type.
  • Crow delegates all review/CI log fetching to Claude via gh/glab rather than fetching and parsing them itself — avoids new API scopes and keeps the attack surface small.
  • The prompt text is constructed from data Crow already has in memory (PR URL, PR number, check names), not from user-controlled free-text fields like review bodies. This limits injection risk in the terminal prompt.
  • The settingsProvider closure reads settings lazily, so a toggle-off takes effect on the very next poll cycle — no stale-permission window.
  • @MainActor on AutoRespondCoordinator serializes all dispatch calls, avoiding TOCTOU races between the settings check and the terminal send.

Concerns:

  • (Low) AutoRespondPrompts.build interpolates transition.prURL directly into the prompt string. Since prURL comes from the GitHub GraphQL API (not user input), this is safe in practice — but if the app ever ingests PR URLs from other sources (e.g. user-pasted links), a malicious URL containing backticks or shell metacharacters could be interpreted by the terminal. The current trust boundary is fine; just worth noting.

Code Quality

Well done:

  • Clean separation of concerns: pure PRStatus.transitions() for logic, IssueTracker for stateful dedupe, AutoRespondCoordinator for dispatch, NotificationManager for user-facing alerts. Each is independently testable.
  • The dedupe key design is thoughtful — changesRequested keys on (session, kind) to be SHA-agnostic, while checksFailing includes SHA so a new failing commit re-fires. The test suite covers both.
  • The re-arm logic in applyPRStatuses correctly clears emitted keys when the triggering condition clears, preventing stale dedupe entries from accumulating.
  • 14 new transition tests with good coverage of edge cases: first-observation gate, no-op on repeat, re-arm, simultaneous transitions, dedupe key shape.
  • Backward-compatible Codable with decodeIfPresent for both AutoRespondSettings and the new headSha on PRStatus.

Minor observations:

  • IssueTracker.swift:1413-1416 — the re-arm for checksFailing guards on old.headSha != nil, but the dedupe key for headSha == nil is "…|checksFailing|" (empty string). If checks transition to failing with headSha == nil, that key will be emitted but never re-armed. In practice headSha is always populated from headRefOid in the GraphQL response, so this is cosmetic — but adding an else branch to also remove the empty-SHA key would make the logic fully symmetric.
  • AutoRespondPrompts.build line 82: the GitHub inline-comments hint includes a literal {owner}/{repo} placeholder. Claude will figure it out, but interpolating the actual owner/repo from transition.prURL would make the prompt more actionable. Very minor.

Summary Table

Priority Issue
🟢 Green Re-arm doesn't cover headSha == nil edge case (cosmetic — SHA always present in practice)
🟢 Green GitHub inline-comments gh api hint uses literal {owner}/{repo} placeholder

Recommendation: Approve — this is a clean, well-tested feature with solid architecture, good defaults (off by default), and no blocking issues. The two green items are cosmetic and can be addressed in a follow-up if desired.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Auto-detect and respond to PR status changes (changes requested, failed pipeline)

2 participants